
    Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images

    In recent years, a large number of binarization methods have been developed, with varying generalization performance and strength on different benchmarks. To leverage these methods, this work introduces an ensemble of experts (EoE) framework that efficiently combines the outputs of various methods. The proposed framework offers a new selection process for the binarization methods, which act as the experts in the ensemble, by introducing three concepts: confidentness, endorsement, and schools of experts. The framework, which is highly objective, is built on two general principles: (i) consolidation of saturated opinions and (ii) identification of schools of experts. After the endorsement graph of the ensemble is built for an input document image based on the confidentness of the experts, the saturated opinions are consolidated, and the schools of experts are then identified by thresholding the consolidated endorsement graph. A variation of the framework is also introduced in which no selection is made and the outputs of all experts are combined using endorsement-dependent weights. The EoE framework is evaluated on the set of participating methods in the H-DIBCO'12 contest and also on an ensemble generated from various instances of the grid-based Sauvola method, with promising performance. Comment: 6-page version, Accepted to be presented in ICDAR'1
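    The no-selection variation described in the abstract above combines all expert outputs with endorsement-dependent weights. The abstract does not say how those weights are derived, so the following is only a minimal sketch of the final combination step, assuming the weights are already available as non-negative numbers. The function name `combine_weighted`, the 0/1 mask representation, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
def combine_weighted(masks, weights, threshold=0.5):
    """Combine binary masks from several experts by weighted voting.

    masks:   list of 2D lists holding 0/1 pixel labels, one per expert
             (assumed representation, all masks the same size).
    weights: one non-negative weight per expert (assumed to be given;
             how they follow from endorsement is not modeled here).
    """
    if len(masks) != len(weights):
        raise ValueError("need exactly one weight per expert")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    rows, cols = len(masks[0]), len(masks[0][0])
    combined = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Normalized weighted vote for labeling this pixel as 1.
            score = sum(w * m[r][c] for m, w in zip(masks, weights)) / total
            row.append(1 if score >= threshold else 0)
        combined.append(row)
    return combined


expert_a = [[1, 0], [0, 0]]
expert_b = [[1, 1], [0, 0]]
expert_c = [[0, 1], [1, 0]]
result = combine_weighted([expert_a, expert_b, expert_c], [3, 1, 1])
```

    With equal weights this reduces to a plain majority vote; giving one expert a larger weight lets it override two weaker experts that disagree with it.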

    Challenges and complexities in application of LCA approaches in the case of ICT for a sustainable future

    In this work, three of the many ICT-specific challenges of LCA are discussed. First, the issue of inconsistency versus uncertainty is reviewed with regard to the meta-technological nature of ICT. As an example, semiconductor technologies are used to highlight the complexities, especially with respect to energy and water consumption. The need for specific representations and metrics to separately assess products and technologies is discussed. It is highlighted that applying product-oriented approaches would result in abandoning or disfavoring new technologies that could otherwise help toward a better world. Second, several hot spots believed to be untouchable are highlighted to emphasize their importance and footprint. The list includes, but is not limited to, i) User-Computer Interfaces (UCIs), especially screens and displays, ii) Network-Computer Interfaces (NCIs), such as electronic and optical ports, and iii) electricity power interfaces. In addition, considering cross-regional social and economic impacts, and also taking into account the marketing-driven nature of the need for many ICT products and services, in the form of both hardware and software, the complexity of the End of Life (EoL) stage of ICT products, technologies, and services is explored. Finally, the impact of smart management and intelligence, and of software in general, in ICT solutions and products is highlighted. In particular, it is observed that, even with the same technology, the significance of software can vary greatly depending on the level of intelligence and awareness deployed. With examples from an interconnected network of data centers managed using Dynamic Voltage and Frequency Scaling (DVFS) technology and smart cooling systems, it is shown that unadjusted assessments could be highly uncertain, and even inconsistent, in calculating the significance of the management component for ICT impacts. Comment: 10 pages. Preprint/Accepted of a paper submitted to the ICT4S Conferenc

    Carbon-profit-aware job scheduling and load balancing in geographically distributed cloud for HPC and web applications

    This thesis introduces two carbon-profit-aware control mechanisms that can be used to improve the performance of job scheduling and load balancing in an interconnected system of geographically distributed data centers for HPC and web applications. These control mechanisms consist of three primary components that perform: 1) measurement and modeling, 2) job planning, and 3) plan execution. The measurement and modeling component provides information on energy consumption and carbon footprint as well as utilization, weather, and pricing. The job planning component uses this information to suggest the best arrangement of applications as a candidate configuration, which the plan execution component then applies to the system. For reporting and decision-making purposes, some metrics need to be modeled based on directly measured inputs. There are two challenges in accurately modeling these metrics: 1) feature selection and 2) curve fitting (regression). First, to improve the accuracy of power consumption models of underutilized servers, advanced fitting methodologies were applied to the selected server features. The resulting model is then evaluated on real servers and is used as part of the load balancing mechanism for web applications. We also provide an inclusive model of the cooling system in data centers to optimize its power consumption, which in turn is used by the planning component. Furthermore, we introduce another model to calculate the profit of the system based on the price of electricity, carbon tax, operational costs, sales tax, and corporation taxes. This model is used for optimized scheduling of HPC jobs. For placement of web applications, a new heuristic algorithm is introduced for load balancing of virtual machines in a geographically distributed system in order to improve its carbon awareness. This new heuristic algorithm is based on a genetic algorithm and is specifically tailored to optimization problems of interconnected systems of distributed data centers. A simple version of this heuristic algorithm has been implemented in the GSN project as a carbon-aware controller. Similarly, for scheduling of HPC jobs on servers, two new metrics are introduced: 1) profit-per-core-hour-GHz and 2) virtual carbon tax. In the HPC job scheduler, these new metrics are used to maximize profit and minimize the carbon footprint of the system, respectively. Once the application execution plan is determined, the plan execution component attempts to implement it on the system. The plan execution component directly uses the hypervisors on physical servers to create, remove, and migrate virtual machines. It also executes and controls the HPC jobs or web applications on the virtual machines. To validate systems designed using the proposed modeling and planning components, a simulation platform using real system data was developed, and the new methodologies were compared with state-of-the-art methods under various scenarios. The experimental results show improvement in power modeling of servers, significant carbon reduction in load balancing of web applications, and significant profit-carbon improvement in HPC job scheduling
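    The abstract above names the inputs of the profit model (electricity price, carbon tax, operational costs, sales tax, and corporation taxes) but not its form. As a rough illustration only, and not the thesis's actual model, the sketch below assumes a simple linear accounting: sales tax is taken off gross revenue, energy and carbon costs scale with consumed kWh, and corporate tax applies only to positive pre-tax profit. The function name `net_profit` and every parameter name are hypothetical.

```python
def net_profit(revenue, energy_kwh, price_per_kwh,
               carbon_kg_per_kwh, carbon_tax_per_kg,
               operational_cost, sales_tax_rate, corporate_tax_rate):
    """Illustrative after-tax profit for one scheduling period.

    All formulas here are assumptions made for this sketch; the source
    only lists which quantities the profit model depends on.
    """
    sales_tax = revenue * sales_tax_rate              # assumed: levied on gross revenue
    electricity_cost = energy_kwh * price_per_kwh
    carbon_cost = energy_kwh * carbon_kg_per_kwh * carbon_tax_per_kg
    pretax = (revenue - sales_tax - electricity_cost
              - carbon_cost - operational_cost)
    # Assumed: corporate tax is charged only when the period is profitable.
    corporate_tax = max(pretax, 0.0) * corporate_tax_rate
    return pretax - corporate_tax


profit = net_profit(revenue=1000.0, energy_kwh=1000.0, price_per_kwh=0.1,
                    carbon_kg_per_kwh=0.5, carbon_tax_per_kg=0.02,
                    operational_cost=200.0, sales_tax_rate=0.1,
                    corporate_tax_rate=0.25)
```

    Evaluating such a function per data center, with local electricity prices and grid carbon intensities, is one way a scheduler could compare candidate placements of a job.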